Main figures
Cohort last updated: 2020-10-29 => 2021-04-02 patient exclusions => 2021-07-26 diagnoses updates and patient exclusions
Diagnoses
For fun: Adding the additional patients with TSFs:
previous plots with validation and discovery separated (hidden)
Validation figure comparing Fusion-sq to FFPM >0.1
Key message: clinically relevant fusions (colored) can also be detected when they are lowly expressed (dots below the line). Filtering on FFPM > 0.1 is also a poor strategy for identifying drivers, because other fusion predictions pass that threshold as well (grey jittered points above the line). Finally, the approach is specific: only a few additional high-confidence tumor-specific fusions are identified (black dots).
Note: Reciprocal fusions are removed for display purposes. Jittered points are shown for all predictions.
Note: “SS18–SSX1” == “AC091021.1–SSX1”
Note: IGH-@-ext–MYC reciprocal only for PMCID340AAO
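The argument against a plain FFPM > 0.1 cutoff can be illustrated with a minimal sketch in base R. The table, its column names (`ffpm`, `clinically_relevant`) and all values are invented for illustration and are not taken from the cohort; only the 0.1 cutoff comes from the text above.

```r
# Toy fusion predictions; names and values are hypothetical.
fusions <- data.frame(
  fusion = c("FUSION_A", "FUSION_B", "FUSION_C", "FUSION_D"),
  ffpm = c(2.40, 0.04, 0.35, 0.02),
  clinically_relevant = c(TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Expression cutoff discussed in the text.
ffpm_cutoff <- 0.1
fusions$passes_ffpm <- fusions$ffpm > ffpm_cutoff

# Clinically relevant fusions that the expression filter would discard.
missed <- fusions$fusion[fusions$clinically_relevant & !fusions$passes_ffpm]

# Other predictions that the expression filter would still keep.
retained_other <- fusions$fusion[!fusions$clinically_relevant & fusions$passes_ffpm]

print(missed)
print(retained_other)
```

In this toy table the cutoff both drops a lowly expressed clinically relevant fusion and keeps an unrelated prediction, which are the two failure modes the figure is meant to show.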
Previous simpler versions (hidden)
Validation: RNA-DNA distance and tool concordance
Fusions and SVs can be matched across a wide range of distances between the RNA and DNA breakpoints, because matching uses the intron-exon gene structure (upstream and downstream gene on the x and y axes, respectively). By comparison, even a large fixed distance of 10 kb would not be sufficient in all cases (red lines). In addition, all clinically relevant fusions are detected by all three tools at nucleotide resolution within the precise intervals (adj intron/flank/sj), except for ASPSCR1–TFE3, which requires composite for Manta and shows slight differences between the tools.
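The difference between matching on gene structure and matching on a fixed distance can be sketched as follows. This is a simplified illustration, not the matching procedure used in the analysis: the helper `in_interval()` and all coordinates are hypothetical, and only the 10 kb window comes from the text.

```r
# Hypothetical helper: is a DNA breakpoint inside a closed interval?
in_interval <- function(pos, start, end) {
  pos >= start & pos <= end
}

# Toy coordinates: the RNA breakpoint sits at an exon boundary, while the
# DNA breakpoint lies deep inside the adjacent (long) intron.
rna_bp <- 1000000
intron_start <- 1000001
intron_end <- 1085000
dna_bp <- 1062500

# Structure-based match: the SV breakpoint falls in the adjacent intron.
matched_by_structure <- in_interval(dna_bp, intron_start, intron_end)

# Distance-based match: the SV breakpoint lies within 10 kb of the RNA breakpoint.
window_bp <- 10000
matched_by_window <- abs(dna_bp - rna_bp) <= window_bp

print(c(structure = matched_by_structure, window = matched_by_window))
```

With a long intron, the breakpoints are 62.5 kb apart, so the fixed window misses a pair that the interval-based rule accepts.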
See other markdown for numbers on precision.
=> TODO: update with overlaps and bp distances. Figure no longer correct
2021-05-26: validation set = clinically relevant fusion in the right orientation, unless no other is available. Still needed: per tool, the DNA-RNA distance of the selected high-confidence SV.
Tumor-specific SVs per patient
Key message: few fusion predictions per patient have WGS support, and those that do are patient-specific.
Tumor-specific fusions in discovery set only
Fusions per patient and genomic instability
Percentiles for gene fusion burden
## [1] "somatic_hc cnt"
## 50% 90% 95% 99%
## 0.00 3.00 5.55 12.84
## [1] "Used in the paper: somatic_low_af_hc cnt ,"
## 50% 90% 95% 99%
## 0.50 3.00 7.00 16.13
## [1] "any somatic_low_af cnt (includes low confidence),"
## 50% 90% 95% 99%
## 1.00 4.00 7.00 17.55
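Percentile summaries in this layout are what base R's `quantile()` prints. A minimal sketch, assuming a numeric vector of per-patient fusion counts; the vector `fusion_counts` below is invented and does not reproduce the cohort values shown above.

```r
# Toy per-patient fusion counts; values are hypothetical.
fusion_counts <- c(0, 0, 0, 1, 1, 2, 3, 5, 8, 20)

# Same percentiles as reported above: median, 90th, 95th and 99th.
burden_percentiles <- quantile(fusion_counts, probs = c(0.5, 0.9, 0.95, 0.99))

print(burden_percentiles)
```

`quantile()` interpolates between observations by default (type 7), which is why upper percentiles can be non-integer even though the counts themselves are integers.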
Fraction of genome altered distribution and median in red
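For orientation, fraction of genome altered (FGA) is commonly computed as the summed length of copy-number-altered segments divided by the summed length of all profiled segments. The sketch below follows that common definition; the segment table, its column names and the |log2 ratio| > 0.2 threshold are assumptions for illustration, and the definition used for the figure may differ.

```r
# Toy copy-number segments for one patient; values are hypothetical.
segments <- data.frame(
  start = c(1, 50000001, 120000001),
  end = c(50000000, 120000000, 200000000),
  log2_ratio = c(0.02, 0.65, -0.40)
)

# Assumed threshold for calling a segment altered.
cna_threshold <- 0.2

# Segment lengths in bp (closed intervals).
seg_len <- segments$end - segments$start + 1
altered <- abs(segments$log2_ratio) > cna_threshold

# FGA = altered bp / profiled bp.
fga <- sum(seg_len[altered]) / sum(seg_len)

print(fga)
```

Applying this per patient yields one FGA value each, and `median()` over those values gives the kind of cohort median marked in red.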
Plots relating FGA, somatic/low-af fusion burden (high confidence) and CNA
per patient find patterns fga/cna/fusion burden
Combine predicted and high conf in 1 figure
## [1] TRUE
Predicted ffpm >0.1
## [1] 2.55
With all predicted fusions
## [1] 22.6